Investigating Stochastic Speech Understanding
Abstract
The need for human expertise in the development of a speech understanding system can be greatly reduced by the use of stochastic techniques. However, corpus-based techniques require the annotation of large amounts of training data. Manual semantic annotation of such corpora is tedious, expensive, and subject to inconsistencies. This work investigates the influence of the training corpus size on the performance of the understanding module. The use of automatically annotated data is also investigated as a means to increase the corpus size at a very low cost. First, a stochastic speech understanding model developed using data collected with the LIMSI ARISE dialog system is presented. Its performance is shown to be comparable to that of the rule-based caseframe grammar currently used in the system. In a second step, two ways of reducing the development cost are pursued: (1) reducing the amount of manually annotated data used to train the stochastic models and (2) using automatically annotated data in the...
Similar resources
Investigating the Processes and Means Involved in EAP Teacher Learning: A Sociocultural Analysis of In-Service Teachers' Experience as They Professionally Learn
There is little understanding of the processes and means involved in teacher learning. Further research is needed to better understand what happens when teachers learn. Therefore, using sociocultural theory (SCT) as a theoretical framework, English for Academic Purposes (EAP) teacher learning was documented and examined. To achieve this goal, an in-depth description of nine in-service EAP teach...
Issues in the development of a stochastic speech understanding system
In the development of a speech understanding system, the recourse to stochastic techniques can greatly reduce the need for human expertise. A known disadvantage is that stochastic models require large annotated training corpora in order to reliably estimate model parameters. Manual semantic annotation of such corpora is tedious, expensive, and subject to inconsistencies. In order to decrease th...
Stochastic language models for speech recognition and understanding
Stochastic language models for speech recognition have traditionally been designed and evaluated in order to optimize word accuracy. In this work, we present a novel framework for training stochastic language models by optimizing two different criteria appropriate for speech recognition and language understanding. First, the language entropy and salience measure are used for learning the releva...
An Approach to Natural Speech Understanding Based on Stochastic Models in a Hierarchical Structure
In this paper, an approach for understanding natural speech by means of two stochastic knowledge bases is presented. Within a given domain, the semantic model generates possible semantic structures, which are semantic representations close to the word level. Corresponding to such a semantic structure, the syntactic model generates word chains using hierarchical Hidden Markov Models. Integrated i...
A stochastic grammar for isolated representation of syntactic and semantic knowledge
A new form of grammar is described that provides two separate sets of stochastic parameters for representing the semantic and the syntactic knowledge required for automatic speech understanding. The semantic structure is introduced as an adequate representation of naturally spoken, one-sentence command utterances. The constraints and probabilities delivered by the grammar can be integra...